Add structured output with tools support for models with limited structured output #2071
Conversation
support for tools on limited models (e.g., Gemini)

Enable using tools and structured outputs together on models (e.g., Gemini) that don't natively support both simultaneously. Introduce an opt-in parameter `enable_structured_output_with_tools` on the `Agent` class, which injects JSON formatting instructions into the system prompt for `LitellmModel` as a workaround.

Changes:
- Add `enable_structured_output_with_tools` parameter to `Agent` (default: `False`)
- Implement prompt injection utilities in `src/agents/util/_prompts.py`
- Update `LitellmModel` to inject JSON instructions when enabled
- Extend model interfaces to accept `enable_structured_output_with_tools`
- Add comprehensive unit tests (13 total) and one integration test
- Add documentation in `docs/models/structured_output_with_tools.md`
- Update `docs/agents.md` and `docs/models/litellm.md` with usage examples
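A rough sketch of what such a prompt-injection helper could look like (names and prompt wording are illustrative, assuming a Pydantic output type; this is not the PR's actual `_prompts.py`):

```python
from __future__ import annotations

import json

from pydantic import BaseModel


def build_json_prompt(output_type: type[BaseModel]) -> str:
    """Render instructions asking the model to reply with JSON matching the schema."""
    schema = json.dumps(output_type.model_json_schema(), indent=2)
    return (
        "You must respond with a single JSON object that conforms to the "
        f"following JSON Schema, with no surrounding prose:\n{schema}"
    )


def inject_json_prompt(system_prompt: str | None, output_type: type[BaseModel]) -> str:
    """Append the JSON instructions to an existing system prompt, if any."""
    instructions = build_json_prompt(output_type)
    return f"{system_prompt}\n\n{instructions}" if system_prompt else instructions
```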
when enabled

Maintain backward compatibility with third-party `Model` implementations by only passing the `enable_structured_output_with_tools` parameter when it's explicitly enabled (`True`). This prevents `TypeError`s in custom `Model` classes that don't support the new parameter yet.

- Built-in models default to `False`, so they work either way
- Third-party models without the parameter won't crash
- The feature still works when explicitly enabled

Fixes the backward-compatibility issue raised in code review.
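A minimal sketch of that pattern (the `call_model` wrapper is hypothetical, not the SDK's actual call site):

```python
from typing import Any


async def call_model(
    model: Any,
    *,
    enable_structured_output_with_tools: bool = False,
    **kwargs: Any,
) -> Any:
    # Forward the new keyword only when explicitly enabled. Third-party Model
    # subclasses whose get_response() predates the parameter never receive an
    # unexpected kwarg, so no TypeError is raised.
    if enable_structured_output_with_tools:
        kwargs["enable_structured_output_with_tools"] = True
    return await model.get_response(**kwargs)
```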
@devv-shayan Thanks for looking into this. However, as I mentioned at #2032 (comment), we prefer resolving this only by changes on the LiteLLM or our LiteLLM adapter side. So, we don't accept this large diff for this.
Sure, I'll work on moving it to the LiteLLM adapter side.
Moved the `enable_structured_output_with_tools` parameter from the `Agent` class to `LitellmModel.__init__()` to minimize the diff and isolate changes within the LiteLLM adapter, as requested during code review.

Changes:
- Added `enable_structured_output_with_tools` parameter to `LitellmModel.__init__()`
- Stored as an instance variable and used throughout `LitellmModel`
- Removed the parameter from the `Agent` class and related validation
- Removed the parameter from the `Model` interface (`get_response` / `stream_response`)
- Removed the parameter from `Runner` (no longer passed to model calls)
- Removed the parameter from the OpenAI model implementations
- Reverted test mock models to their original signatures
- Updated `test_gemini_local.py` for model-level configuration
- Updated documentation to reflect model-level usage

Before: `Agent(model=..., enable_structured_output_with_tools=True)`
After: `Agent(model=LitellmModel(..., enable_structured_output_with_tools=True))`
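In sketch form, the model-level configuration looks roughly like this (simplified; the real `LitellmModel.__init__` accepts more parameters):

```python
from __future__ import annotations


class LitellmModel:
    """Simplified sketch of the adapter change described in this commit."""

    def __init__(
        self,
        model: str,
        api_key: str | None = None,
        enable_structured_output_with_tools: bool = False,  # the PR's new flag
    ) -> None:
        self.model = model
        self.api_key = api_key
        # Stored once on the instance and consulted wherever the adapter builds
        # a request, instead of being threaded through Agent and Runner.
        self.enable_structured_output_with_tools = enable_structured_output_with_tools
```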
prompt

The JSON prompt injection was only triggered when the tools list was non-empty, but handoffs are converted to function tools and added separately. This meant that agents using only handoffs with an `output_schema` would not get the prompt injection even when `enable_structured_output_with_tools=True`, causing Gemini to error with "Function calling with response mime type 'application/json' is unsupported."

Changes:
- Combine tools and handoffs before checking whether the JSON prompt should be injected
- Add a test case for the handoffs-only scenario
- Update the inline comment to clarify why handoffs must be included

This ensures the opt-in flag works correctly for multi-agent scenarios where an agent might use handoffs without regular tools.
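The check described here, in illustrative form (names are not the literal diff):

```python
from __future__ import annotations


def should_inject_json_prompt(
    enabled: bool,
    output_schema: object | None,
    tools: list,
    handoffs: list,
) -> bool:
    # Handoffs become function tools downstream, so an agent with only
    # handoffs still hits Gemini's tools + response_schema limitation and
    # needs the JSON prompt as well.
    return bool(enabled and output_schema and (list(tools) + list(handoffs)))
```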
Force-pushed from 180609c to 1d140ff.
seratch left a comment:
Thanks for sharing the idea, but adding an extra prompt this way is not what we want to have in this SDK. If this works well for you, please use the code as part of your project and/or publish the small module as your own package. Thanks again for your efforts here.
@@ -0,0 +1,117 @@
"""Utility functions for generating prompts for structured outputs."""
Thanks for sharing this idea, but this is not what we want to have. If this works well for you, please use this code as part of your project and/or publish the small module as your own package.
Thanks for your feedback, but I noticed the codebase already includes automatic request modifications for API workarounds (`_fix_tool_message_ordering` for Anthropic) and supports instruction modification via `handoff_prompt.py` and `call_model_input_filter`. My implementation follows the same pattern but is opt-in via a flag. I understand the distinction between structural modifications (like `_fix_tool_message_ordering`, which reorders messages) and content modifications (prompt injection), but both are automatic SDK workarounds for API limitations. Could you clarify why structural modifications are acceptable but content modifications are not? Both address provider-specific limitations and improve compatibility.
We basically don't append or modify users' prompts. There may be some exceptions in the future, but for the purpose we discussed here, we'd like to avoid doing so.
Understood. Since the SDK avoids it, I’ll withdraw the changes from core. I’ll publish this as a small external extension (wrapper/subclass or a helper used via call_model_input_filter/dynamic instructions) so teams that need Gemini tools + structured outputs can opt in at the app layer. If helpful, I can open a tiny doc PR to link to the community package rather than adding prompt logic here.
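For the app-layer approach mentioned here, dynamic instructions are one place the prompt logic can live without touching the SDK (a sketch; the `WeatherReport` schema and prompt wording are illustrative):

```python
from __future__ import annotations

import json

from pydantic import BaseModel

from agents import Agent, RunContextWrapper


class WeatherReport(BaseModel):
    city: str
    summary: str


def json_instructions(ctx: RunContextWrapper, agent: Agent) -> str:
    # Append JSON-format guidance at the app layer, keeping the SDK untouched.
    schema = json.dumps(WeatherReport.model_json_schema(), indent=2)
    return (
        "You are a weather assistant. Respond only with a JSON object "
        f"matching this schema:\n{schema}"
    )


agent = Agent(name="weather", instructions=json_instructions)
```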
Add structured output with tools support for models with limited structured output
Summary
Adds an opt-in structured output with tools feature to `LitellmModel`, enabling the use of tools and structured outputs simultaneously on models that don't natively support both (specifically Google Gemini via LiteLLM).

Problem:
Models like Gemini return `BadRequestError: Function calling with a response mime type 'application/json' is unsupported` when trying to use tools together with structured outputs (via `response_schema`).

Solution:
A new `enable_structured_output_with_tools` parameter on the `LitellmModel` class that:
- injects JSON formatting instructions into the system prompt
- disables `response_format` to avoid API errors
- defaults to `False` for backward compatibility

Test plan
Unit tests (14 tests added):
- `tests/utils/test_prompts.py` – Tests for prompt generation and injection logic

Integration test:
- `tests/test_gemini_local.py` – Manual test script for Gemini (requires API key)

Verification performed:
- 20 failures are pre-existing Windows-specific SQLite file locking issues (not related to this PR). All 14 new tests pass.

Test coverage:
- The `enable_structured_output_with_tools` parameter

Issue number
#2032
Checks
- `make lint` and `make format`

Documentation
Added comprehensive documentation:
- `docs/models/structured_output_with_tools.md` – Complete guide with examples
- `docs/models/litellm.md` – Added Gemini integration example
- `mkdocs.yml` – Added new doc page to navigation

Files Changed
Source code:
- `src/agents/extensions/models/litellm_model.py` – Added `enable_structured_output_with_tools` parameter and implementation
- `src/agents/util/_prompts.py` – Utility for JSON prompt generation
- `src/agents/models/interface.py` – Updated for consistency
- `src/agents/models/openai_chatcompletions.py` – Pass-through (ignores parameter)
- `src/agents/models/openai_responses.py` – Pass-through (ignores parameter)
- `src/agents/run.py` – No longer handles structured output logic

Tests:
- `tests/utils/test_prompts.py` – 14 new unit tests
- `tests/test_gemini_local.py` – Integration test

Documentation:
- `docs/models/structured_output_with_tools.md` – New comprehensive guide
- `docs/models/litellm.md` – Added Gemini example
- `mkdocs.yml` – Updated navigation

Implementation Details
Design decisions:
- The default `enable_structured_output_with_tools=False` ensures no impact on existing integrations.
- Changes are isolated to `LitellmModel`; OpenAI models continue to use native support.

How it works:
- Set `enable_structured_output_with_tools=True` when initializing `LitellmModel`.
- JSON formatting instructions are injected into the system prompt.
- `response_format` is disabled to avoid API errors.

Example Usage
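A hedged reconstruction of the usage, based on the PR description (the model id, tool, and schema are illustrative):

```python
from __future__ import annotations

from pydantic import BaseModel

from agents import Agent, Runner, function_tool
from agents.extensions.models.litellm_model import LitellmModel


class WeatherReport(BaseModel):
    city: str
    summary: str


@function_tool
def get_temperature(city: str) -> str:
    """Return a temperature reading for the city (stubbed for the example)."""
    return f"{city}: 21°C"


agent = Agent(
    name="weather",
    model=LitellmModel(
        model="gemini/gemini-2.0-flash",  # illustrative model id
        enable_structured_output_with_tools=True,  # the PR's opt-in flag
    ),
    tools=[get_temperature],
    output_type=WeatherReport,
)

result = Runner.run_sync(agent, "What's the weather in Tokyo?")
print(result.final_output)
```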
Backward Compatibility
✅ Fully backward compatible:
- Default is `False`
- Only affects `LitellmModel` when explicitly enabled